Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

format as a annotation by default - separate vocabs (option 3) #1027

Merged
merged 5 commits into from
Nov 24, 2020

Conversation

gregsdennis
Copy link
Member

@gregsdennis gregsdennis commented Nov 21, 2020

There is also this in the "Custom format attributes" section below the edits:

An implementation MUST NOT fail validation or cease processing due to an unknown format attribute.

What is an implementation to do when an unknown format is encountered other than fail?

The context may be around collecting annotations, but it says "validation," so I'm confused. Maybe this needs to be clarified a bit too.


Resolves #1020.
Closes #1023.
Closes #1026.

@Relequestual
Copy link
Member

There is also this in the "Custom format attributes" section below the edits:

An implementation MUST NOT fail validation or cease processing due to an unknown format attribute.

What is an implementation to do when an unknown format is encountered other than fail?

The context may be around collecting annotations, but it says "validation," so I'm confused. Maybe this needs to be clarified a bit too.

For annotation vocab, I would say, collect it as an annotation as long as it's a string. If it's not a string, then fail.

For assertion vocab, FAIL. The set of allowed values is explicit.

@gregsdennis
Copy link
Member Author

gregsdennis commented Nov 21, 2020

I'll add this into the update.

(Validating that the format is a string is a meta-schema responsibility when validating the schema. That relationship always gets confusing.)

Copy link
Member

@Relequestual Relequestual left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we were totally on the same page but close.
I was suggesting we should do away with saying anything about an implementation option to change annotation format to an assertion. It's confusing and was meant to only be a stop-gap till we found a solution.

Either you get annotations (default), and an implementation MAY do post-processing with the annotation results, but there should be no impact on the JSON Schema validation results.

Or you get full semantic validation.

jsonschema-validation.xml Outdated Show resolved Hide resolved
jsonschema-validation.xml Show resolved Hide resolved
@karenetheridge
Copy link
Member

karenetheridge commented Nov 22, 2020

For annotation vocab, I would say, collect it as an annotation as long as it's a string. If it's not a string, then fail.

This can only be done since we know that all existing formats are for instance data that are strings, and no other type. Are we expecting to add additional formats to this vocabulary in the future, or will those have to be added as separate vocabularies?

edit: I wrote this thinking that "it's a string" was referring to the data instance. If instead you meant that the value of the 'format' property itself in the schema was the "it", then yes -- the value of format must always be a string, even if it's unrecognized, because that's what the format metaschema says.

@karenetheridge
Copy link
Member

karenetheridge commented Nov 22, 2020

Another thought -- the format-as-assertion vocabulary should have an enum that lists the allowable values of the format keyword. Any unrecognized value should be a failure, and the metaschema can be used to perform that check.

Also: if we go ahead with this PR, we'll need new meta/*.json files for use by the metaschema (and the $vocabulary entry in the main metaschema file will need updates too).

Copy link
Member

@karenetheridge karenetheridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure of some of the specific wording here, but I lean towards favouring the approach in this PR vs. the other alternatives.

@gregsdennis
Copy link
Member Author

gregsdennis commented Nov 22, 2020

the format-as-assertion vocabulary should have an enum that lists the allowable values of the format keyword

This would conflict with the section on custom keywords. We need to allow for support for these, and having an enum of "approved" formats in the meta-schema would prevent that. This test of "known" formats must be performed by the implementation. Shouldn't be too hard, though: "I don't have a handler for this format... FAIL!"

@karenetheridge
Copy link
Member

This would conflict with the section on custom keywords.

I had gotten the impression that we didn't want to continue to support custom formats -- if a user wants a new format they should create a new vocabulary for it. It would definitely be in the spirit of "making format more sane and less hand-wavey".

@gregsdennis
Copy link
Member Author

That's understandable, but custom formats aren't the problem that this PR is trying to resolve. It's easy enough to continue supporting them for now.

@gregsdennis
Copy link
Member Author

As a side note, even though I split the meta-schemas, I haven't updated the draft: they're still 2019-09, but they're aligned with the others.

Copy link
Member

@Relequestual Relequestual left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last change, but noticed the title for the meta-schema was the same for both format related meta-schemas, and this could be confusing.

If you could apply the same suggestions to both, I'll then approve and merge this PR.

meta/format-annotation.json Outdated Show resolved Hide resolved
jsonschema-validation.xml Outdated Show resolved Hide resolved
@Relequestual
Copy link
Member

I'm so annoying, but can you

  • Modify the section title from A Vocabulary for Semantic Content With "format"' to Vocabularies for...
  • Update the general purpose meta-schema to include the new vocabulary URI
  • Add something to the changelog about this

THEN, it will merge. Honest.

@Relequestual
Copy link
Member

Agreed with Greg, I will do the work as required in the above comment after merging this PR.

@Relequestual Relequestual merged commit 2ce4968 into master Nov 24, 2020
@Relequestual
Copy link
Member

I have done the three items as listed above.

@Relequestual Relequestual deleted the separate-format-vocabularies branch November 24, 2020 23:44
@@ -514,18 +514,12 @@
<section title="Foreword">
<t>
Structural validation alone may be insufficient to allow an application to correctly
utilize certain values. The "format" annotation keyword is defined to allow schema
utilize certain values. The "format" annotation keyword is defined to allow schema
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we keep unrelated whitespace changes out of the PR? it adds confusion and often conflicts with other things in flight.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was requested by @Relequestual in Slack.

&lt;https://json-schema.org/draft/2020-11/vocab/format-annotation&gt;. The current
URI for the corresponding meta-schema is:
<eref target="https://json-schema.org/draft/2020-11/meta/format-annotation"/>.
Implementing support for this vocabulary is REQUIRED.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggested change (using the proper IETF language):

  •                Implementations MUST support this vocabulary.
    

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUIRED is proper language. RFC 2119 section 1.

&lt;https://json-schema.org/draft/2020-11/vocab/format&gt;.
In addition to the Format-Annotation vocabulary, a secondary vocabulary is available
for custom meta-schemas that defines "format" as an assertion. The URI for the
Format-Assertion vocabulary, is:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove comma

&lt;https://json-schema.org/draft/2020-11/vocab/format-assertion&gt;. The current
URI for the corresponding meta-schema is:
<eref target="https://json-schema.org/draft/2020-11/meta/format-assertion"/>.
Implementing support for the Format-Assertion vocabulary is OPTIONAL.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggested change:

  •                 Implementations MAY choose to support this vocabulary.
    

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OPTIONAL is proper language. RFC 2119 section 5.

jsonschema-validation.xml Show resolved Hide resolved
behavior described in the next section.
Implementations MAY still treat "format" as an assertion in addition to an
annotation and attempt to validate the value's conformance to the specified
semantics. The implementation MUST provide options to enable and disable such
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why MUST they, if it can also be done via $vocabulary? I would soften this to a SHOULD. Ideally, the fewer knobs needed in an implementation outside of specifying a $schema, the better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is solely for when the annotation vocabulary is used. If an implementation supports assertion (even partial assertion) they MAY choose to treat the keyword as such. If they do, they MUST allow this feature to be enabled and disabled.

The assertion vocabulary doesn't have this allowance, instead specifying that implementations MUST implement assertion fully in order to support the vocabulary.

Copy link
Member

@karenetheridge karenetheridge Nov 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still confused. It sounds like a schema (via the metaschema) can specify to use the format-annotation vocabulary, but it might still get assertion behaviour anyway (when using an implementation that supported validations)? That would force the setting of an out-of-band configuration "no, really, when I say annotations I mean only annotations" and there would be no other way to not get validation behaviour as well. If I'm correct so far, then the configuration value MUST default to false so validation behaviour doesn't happen accidentally.

Copy link
Member Author

@gregsdennis gregsdennis Nov 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The schema can say, "You're getting annotations."

But this bit of the spec allows the client/application to say, "I know I'm supposed to get annotations, but this library I'm using supports asserting formats, and that'd save me some work, so I'd like to use that." It gives the choice to the client.

It also the opens the door for implementations to provide partial validation, which is a pain point that @handrews had to deal with (pressure from implementers) and what the previous wording of this section tried so hard to allow.

However, if the schema says, "You're getting assertions," then the implementation MUST be able to provide validation or refuse to process the schema.

Copy link
Member Author

@gregsdennis gregsdennis Nov 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last scenario is <format-assertion>: false. This means, "If you support asserting these things, that's great, please do. If not, it's cool; they'll be annotations (because you don't understand them)." But this is enforced by the schema, not opted-in by the application.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the end, we have

Annotation
Vocab
Assertion
Vocab
Impl Assert
Support
App Config Result
false/true n/a no n/a annotation
false/true n/a yes annotation annotation
false/true n/a yes assertion assertion
n/a false no n/a annotation
n/a false yes annotation/assertion assertion
n/a true no n/a fail
n/a true yes annotation/assertion assertion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this bit of the spec allows the client/application to say, "I know I'm supposed to get annotations, but this library I'm using supports asserting formats, and that'd save me some work, so I'd like to use that." It gives the choice to the client.

My main point here is that if the implementation can do this, the setting MUST default to NOT doing it. Users must explicitly opt in. Otherwise, there is no way of a schema being processed as annotation only (with any combination of $schema and $vocabulary settings) with an implementation that can also handle assertions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the setting MUST default to NOT doing it

Absolutely. That is covered on the next line: "... evaluation and MUST be disabled by default."

<t>
When the Format-Assertion vocabulary is declared with a value of true,
implementations MUST provide full validation support for all of the formats
defined by this specificaion. Implementations that cannot provide full
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo "specificaion"

@karenetheridge
Copy link
Member

whoops, looks like I hit submit a bit too late...

@Relequestual
Copy link
Member

Ah nuts sorry @karenetheridge - My internet was super flakey yesterday (and few days before that, and still is).
There are a few obvious fixes needed here. Feel free to open a new PR, or I'll try to cover all your comments in final edits on Friday. =] Many thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

format should not change behavior based on its vocabulary value
3 participants